(57) A voice coder system is capable of coding at low bit 
rates under 4.8 kb/s with high speech quality. Speech 
signals are divided into frames, and further divided into 
subframes. A spectral parameter calculator part 
calculates spectral parameters representing spectral 
features of the speech signals in at least one subframe, 
and a spectral parameter quantization part quantizes the 
spectral parameters of at least one subframe preselected 
by using a plurality of stages of quantization code books 
to obtain quantized spectral parameters. A mode 
classifier part classifies the speech signals in the frame 
into a plurality of modes by calculating predetermined 
feature amounts of the speech signals, and a weighting 
part applies perceptual weights to the speech signals by 
using the spectral parameters obtained in the spectral 
parameter calculator part to obtain weighted signals. An 
adaptive code book part obtains pitch parameters 
representing pitch periods of the speech signals in a 
predetermined mode by using the mode classification in 
the mode classifier part, the spectral parameters obtained 
in the spectral parameter calculator part, the quantized 
spectral parameters obtained in the spectral parameter 
quantization part, and the weighted signals; an excitation 
quantization part searches a plurality of stages of 
excitation code books and a gain code book by using the 
spectral parameters, the quantized spectral parameters, 
the weighted signals and the pitch parameters to obtain 
quantized excitation signals of the speech signals. 



Industrie Canada Industry Canada 



CA 021 13928 1997-07-16 




VOICE CODER SYSTEM 



The present invention relates to a voice coder 
system for coding speech signals at low bit rates, 
particularly under 4.8 kb/s, with high quality. 

Conventionally, as a coder system for coding 
speech signals at low bit rates under 4.8 kb/s, a CELP 
(code-excited LPC coding) system has been known, as 
disclosed in various documents, for example: "Code-Excited 
Linear Prediction: High Quality Speech At Very Low Bit 
Rates" by M. Schroeder and B. Atal, Proc. ICASSP, pp. 939-940, 
1985 (Document 1); and "Improved Speech Quality And 
Efficient Vector Quantization in SELP" by Kleijn et al., 
Proc. ICASSP, pp. 155-158, 1988 (Document 2). In this 
system, on the transmitter side, a linear prediction analysis 
of the speech signals is carried out for each frame (for 
example, 20 ms) to extract spectral parameters representing 
spectral characteristics of the speech signals. The frame is 
further divided into subframes (for example, 5 ms), and 
parameters of an adaptive code book, such as delay parameters 
and gain parameters, are extracted for each subframe based 
on past excitation signals. Then, a pitch prediction of the 
speech signals of the subframe is executed by the adaptive 
code book and, for the residual signal obtained by the pitch 
prediction, an optimum excitation code vector is selected 
from an excitation code book (vector quantization code book) 
composed of predetermined types of noise signals, and an 
optimum gain is calculated. The optimum excitation code 
vector is selected so as to minimize the error power 
between: (i) a signal synthesized from the selected noise 
signal, and (ii) the aforementioned residual signal. An 
index representing the type of the selected excitation code 
vector and the optimum gain, and the parameters extracted 
from the adaptive code book, are transmitted. A description 
of the receiver side is omitted. 

In the above-described conventional system, 
disclosed in Documents 1 and 2, a sufficiently large 
excitation code book (for example, 10 bits) is required to 
obtain good speech quality. Accordingly, a vast amount of 
calculation is required for the search of the excitation 
code book. Further, the necessary memory capacity is also 
vast (for example, in the case of 10 bits and 40 dimensions, 
a memory capacity of 40 K words), and thus it is difficult to 
realize such a system with compact hardware. Also, when 
the frame length and the subframe length are increased in 
order to reduce the bit rate, and the dimension number is 
thereby increased without reducing the bit number of the 
excitation code book, the calculation amount increases 
remarkably. 

One method for reducing the size of the code book 
is disclosed in "Multiple Stage Vector Quantization For 
Speech Coding" by B. Juang et al., Proc. ICASSP, pp. 597-600, 
1982 (Document 3). This is a multiple-stage vector 
quantization method, wherein the code book is divided into 
multiple stages of subcode books, and each subcode book is 
independently searched. In this method, since the code book 
is divided into a plurality of stages of subcode books, 
the size of the subcode book for one stage is reduced to, 
for example, B/L bits (B represents the whole bit number, 
and L represents the stage number), and thus the calculation 
amount required for the search of the code book is reduced 
to $L \times 2^{B/L}$ in comparison with $2^B$ for one stage of 
B bits. Further, the necessary memory capacity for storing 
the code book is also reduced. However, since in this method 
each stage of the subcode books is independently trained and 
searched, the performance is substantially degraded as 
compared with one stage of B bits. 
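
As a rough illustration of the figures quoted above, the following Python sketch compares the number of code vectors searched and the number of words stored for a single B-bit code book and for L stages of B/L bits each. The function names are hypothetical, and the sketch assumes that B is divisible by L and that every stage stores full-dimension vectors.

```python
def full_search_cost(bits: int, dim: int) -> tuple:
    """Return (code vectors searched, words stored) for one B-bit code book."""
    vectors = 2 ** bits
    return vectors, vectors * dim


def multistage_cost(bits: int, stages: int, dim: int) -> tuple:
    """Return (code vectors searched, words stored) for L stages of B/L bits.

    Assumption of this sketch: B is divisible by L and every stage keeps
    vectors of the full dimension.
    """
    if bits % stages != 0:
        raise ValueError("bits must be divisible by stages in this sketch")
    vectors = stages * 2 ** (bits // stages)
    return vectors, vectors * dim


single = full_search_cost(10, 40)       # 10 bits, 40 dimensions
two_stage = multistage_cost(10, 2, 40)  # two stages of 5 bits each
print(single, two_stage)
```

With 10 bits and 40 dimensions the single code book needs 1024 vectors and 40960 words (the "40 K words" mentioned earlier), while two 5-bit stages need 64 vectors and 2560 words.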

It is therefore an object of the present invention 
to provide a voice coder system, free from the 
aforementioned problems of the prior art, which is capable 
of coding speech signals at low bit rates, particularly 
under 4.8 kb/s, with good speech quality using a relatively 
small quantity of calculation and memory capacity. 

In accordance with one aspect of the present 
invention, there is provided a voice coder system, 
comprising spectral parameter calculator means for dividing 
input speech signals into frames, and for further dividing 
the speech signals into a plurality of subframes at every 
predetermined timing, and for calculating spectral 
parameters representing spectral features of the speech 
signals in at least one subframe; spectral parameter 
quantization means for quantizing the spectral parameters of 
at least one subframe preselected by using a plurality of 
stages of quantization code books to obtain quantized 
spectral parameters; mode classifier means for classifying 
the speech signals in the frame into a plurality of modes by 
calculating predetermined amounts of the speech signal 
features; weighting means for applying perceptual weights 
to the speech signals, depending on the spectral parameters 
obtained in the spectral parameter calculator means, to 
obtain weighted signals; adaptive code book means for 
obtaining pitch parameters representing pitches of the 
speech signals corresponding to the modes depending on the 
mode classification in the mode classifier means, the 
spectral parameters obtained in the spectral parameter 
calculator means, the quantized spectral parameters obtained 
in the spectral parameter quantization means and the 
weighted signals; and excitation quantization means for 
searching a plurality of stages of excitation code books and 
a gain code book depending on the spectral parameters, the 
quantized spectral parameters, the weighted signals and the 
pitch parameters, to obtain quantized excitation signals of 
the speech signals. 

In the voice coder system, the mode classifier 
means can include means for calculating pitch prediction 
distortions of the subframes from the weighted signals 
obtained in the weighting means, and means for executing the 
mode classification by using a cumulative value of the pitch 
prediction distortions throughout the frame. 

In the voice coder system, the spectral parameter 
quantization means can include means for switching the 
quantization code books depending on the mode classification 
result in the mode classifier means when the spectral 
parameters are quantized. 

In the voice coder system, the excitation 
quantization means can include means for switching the 
excitation code books and the gain code book depending on 
the mode classification result in the mode classifier means 
when the excitation signals are quantized. 

In the excitation quantization means, at least one 
stage of the excitation code books includes at least one 
code book having a predetermined decimation rate. 

Next, the function of a voice coder system 
according to the present invention will be described. 

Input speech signals are divided into frames (for 
example, 40 ms) in a frame divider part, and each frame of 
the speech signals is further divided into subframes (for 
example, 8 ms) in a subframe divider part. In a spectral 
parameter calculator part, a well-known LPC analysis is 
applied to at least one subframe (for example, the first, 
third and/or fifth subframes of the 5 subframes) to obtain 
spectral parameters (LPC parameters). In a spectral 
parameter quantization part, the LPC parameters 
corresponding to a predetermined subframe (for example, the 
fifth subframe) are quantized by using a quantization code 
book. In this case, as the code book, any of a vector 
quantization code book, a scalar quantization code book and a 
vector-scalar quantization code book can be used. 

Next, in a mode classifier part, predetermined 
feature amounts are calculated from the speech signals of 
the frame, and the obtained values are compared with 
predetermined threshold values. Based on the comparison 
results, the speech signals are classified into a plurality 
of mode types (for example, 4 types) for every frame. Then, in 
a perceptual weighting part, by using the spectral 
parameters $a_i$ (i = 1 to P) of the first, third and fifth 
subframes, perceptual weighting signals are calculated 
for every subframe according to formula (1). Here, for 
example, the spectral parameters of the second and fourth 
subframes are calculated by a linear interpolation of the 
spectral parameters of the first and third subframes and of 
the third and fifth subframes, respectively. 



$$X_w(z) = x(z)\,\frac{1 - \sum_{i=1}^{P} a_i \eta^i z^{-i}}{1 - \sum_{i=1}^{P} a_i \gamma^i z^{-i}} \tag{1}$$

wherein $x(z)$ and $X_w(z)$ represent z-transforms of the speech 
signals and the perceptual weighting signals of the frame, 
P represents the order (dimension) of the spectral parameters, 
and $\eta$ and $\gamma$ represent constants for controlling the 
perceptual weighting amount, usually selected to be 
approximately 1.0 and 0.8, respectively. 
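
The weighting of formula (1) can be sketched in the time domain as a direct-form recursion. This is only a minimal illustration: the function name is hypothetical, the sign convention assumes a prediction polynomial of the form 1 minus the sum of a_i z^-i as written in formula (1), and the filter memory is assumed to be zero at the start of the block.

```python
import numpy as np


def perceptual_weighting(x, a, eta=1.0, gamma=0.8):
    """Filter x through (1 - sum a_i eta^i z^-i) / (1 - sum a_i gamma^i z^-i).

    x: speech samples of one block; a: LPC coefficients a_1..a_P.
    Filter memory is assumed to be zero at the start (a simplification).
    """
    a = np.asarray(a, dtype=float)
    order = len(a)
    powers = np.arange(1, order + 1)
    num = a * eta ** powers      # a_i * eta^i
    den = a * gamma ** powers    # a_i * gamma^i
    x = np.asarray(x, dtype=float)
    xw = np.zeros_like(x)
    for n in range(len(x)):
        acc = x[n]
        for i in range(1, order + 1):
            if n - i >= 0:
                acc -= num[i - 1] * x[n - i]   # numerator (FIR) part
                acc += den[i - 1] * xw[n - i]  # denominator (IIR) part
        xw[n] = acc
    return xw
```

When eta equals gamma the numerator and denominator cancel and the output equals the input, which is a quick sanity check on the sign convention.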

Next, in an adaptive code book part, a delay T and 
a gain $\beta$, as parameters relating to pitch, are calculated 
for the perceptual weighting signals in every subframe. In 
this case, the delay corresponds to a pitch period. 
Reference can be made to the aforementioned Document 2 for 
a calculation method for the parameters of the adaptive code 
book. Also, in order to improve the performance of the 
adaptive code book for female speakers in 
particular, the delay for each subframe can be represented 
by a fractional number of sampling periods instead of an 
integer value. More specifically, a paper such as the one 
entitled "Pitch predictors with high temporal resolution" by 
P. Kroon and B. Atal, Proc. ICASSP, pp. 661-664, 1990 
(Document 4) can be referred to. For example, when the 
delay amount of each subframe is represented by an integer 
value, 7 bits are required. When the delay amount is 
represented by a fractional value, the necessary bit number 
increases to approximately 8 bits, but the quality of female 
speech is remarkably improved. 

Further, in order to reduce the amount of 
calculation relating to the parameters of the adaptive code 
book for the perceptual weighting signals, firstly a 
plurality of proposed delays are obtained for every 
subframe, in descending order of the value of formula (2), 
by an open loop search: 

$$D(T) = \frac{P^2(T)}{Q(T)} \tag{2}$$

where: 

$$P(T) = \sum_{n=0}^{N-1} x_w(n)\,x_w(n-T) \tag{3}$$

$$Q(T) = \sum_{n=0}^{N-1} x_w^2(n-T) \tag{4}$$

As described above, at least one proposed delay 
is obtained for every subframe by the open loop search, and 
thereafter the neighborhood of this proposed value is 
searched in every subframe by a closed loop search using drive 
excitation signals of a past frame to obtain a pitch period 
(delay) and a gain. (For more specifics on the method, 
refer to, for example, Japanese Patent Application No. Hei 
3-103262 (Document 5).) 
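
The open loop search of formulas (2) to (4) can be sketched as follows. The function name, the argument layout and the delay range are assumptions made for illustration; the sketch handles integer delays only and expects the buffer to contain enough past samples before the current subframe.

```python
import numpy as np


def open_loop_delays(xw, start, length, t_min, t_max, num_candidates=1):
    """Rank integer delays T by D(T) = P(T)^2 / Q(T), formulas (2)-(4).

    xw: weighted signal including past samples before `start`;
    start: index of the first sample of the current subframe;
    length: subframe length N. Requires start >= t_max.
    """
    xw = np.asarray(xw, dtype=float)
    cur = xw[start:start + length]
    scores = []
    for t in range(t_min, t_max + 1):
        past = xw[start - t:start - t + length]   # x_w(n - T)
        p = float(np.dot(cur, past))              # formula (3)
        q = float(np.dot(past, past))             # formula (4)
        scores.append((p * p / q if q > 0.0 else 0.0, t))
    scores.sort(key=lambda s: (-s[0], s[1]))      # largest D(T) first
    return [t for _, t in scores[:num_candidates]]


# Usage: a pulse train with a period of 50 samples.
signal = np.zeros(400)
signal[::50] = 1.0
print(open_loop_delays(signal, start=200, length=64, t_min=20, t_max=80))
```

Returning several candidates (num_candidates greater than 1) corresponds to keeping a plurality of proposed delays for the subsequent closed loop search.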

In a voiced section, the delay amount of the 
adaptive code book is extremely highly correlated between 
the subframes and, by taking the delay amount difference 
between the subframes and transmitting this difference, the 
transmission amount required for transmitting the delay of 
the adaptive code book can be greatly reduced in comparison 
with a method of transmitting the delay amount for every 
subframe independently. For instance, when the delay amount 
represented by 8 bits is transmitted in the first subframe 
and the difference from the delay amount of the immediately 
preceding subframe is transmitted by 3 bits in the second to 
fifth subframes of every frame, the transmission information 
amount can be reduced from 40 bits to 20 bits for each frame in 
comparison with a case where the delay amount is transmitted 
by 8 bits in all subframes. 
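
The bit count above, and one possible way of forming the differences, can be sketched as follows. The function names are hypothetical, and the signed difference range and the clamping of out-of-range differences are assumptions of this sketch; the text itself gives only the bit allocation.

```python
def delay_bits_per_frame(num_subframes=5, abs_bits=8, diff_bits=3):
    """Bits per frame for independent vs. differential delay transmission."""
    independent = num_subframes * abs_bits
    differential = abs_bits + (num_subframes - 1) * diff_bits
    return independent, differential


def encode_delays(delays, diff_bits=3):
    """Send the first delay as-is and later delays as clamped differences.

    The range [-2^(b-1), 2^(b-1) - 1] and the clamping are assumptions.
    """
    lo, hi = -(1 << (diff_bits - 1)), (1 << (diff_bits - 1)) - 1
    codes, prev = [delays[0]], delays[0]
    for d in delays[1:]:
        diff = max(lo, min(hi, d - prev))
        codes.append(diff)
        prev += diff          # track what the decoder will reconstruct
    return codes


def decode_delays(codes):
    """Rebuild the delays by accumulating the transmitted differences."""
    delays = [codes[0]]
    for diff in codes[1:]:
        delays.append(delays[-1] + diff)
    return delays


print(delay_bits_per_frame())
```

The encoder tracks the decoder's reconstruction rather than the true previous delay, so a clamped difference does not cause the two sides to drift apart.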

Next, in an excitation quantization part, 
excitation code books composed of a plurality of stages of 
vector quantization code books are searched to select a code 
vector for every stage, so that the error power between the 
above-described weighting signal and a weighted reproduction 
signal calculated from each code vector in the excitation code 
books is minimized. For example, when the excitation 
code books are composed of two stages of code books, the 
search of the code vector is carried out according to 
formula (5) as follows: 

$$D = \sum_{n=0}^{N-1} \left[ x_w(n) - \beta v(n-T) \ast h_w(n) - \gamma_1 c_{1j}(n) \ast h_w(n) - \gamma_2 c_{2i}(n) \ast h_w(n) \right]^2 \tag{5}$$

In this formula, $\beta v(n-T)$ represents the adaptive code vector 
calculated in the closed loop search of the adaptive code 
book part, and $\beta$ represents the gain of the adaptive code 
vector. $c_{1j}(n)$ and $c_{2i}(n)$ represent the j-th and i-th 
code vectors of the first and second code books, respectively. 
Also, $h_w(n)$ represents the impulse response indicating the 
characteristics of the weighting filter of formula (6), and 
$\gamma_1$ and $\gamma_2$ represent the optimum gains for the 
first and second code books, respectively. 

$$H_w(z) = \frac{1 - \sum_{i=1}^{P} a_i \eta^i z^{-i}}{1 - \sum_{i=1}^{P} a_i \gamma^i z^{-i}} \tag{6}$$

wherein $\eta$ and $\gamma$ represent the constants for controlling the 
perceptual weighting signals of formula (1). 

Next, after the code vectors minimizing formula 
(5) are searched in the excitation code books, the gain code 
book is searched so as to minimize formula (7) as follows: 

$$D = \sum_{n=0}^{N-1} \left[ x_w(n) - \beta v(n-T) \ast h_w(n) - \gamma_{1k} c_{1j}(n) \ast h_w(n) - \gamma_{2k} c_{2i}(n) \ast h_w(n) \right]^2 \tag{7}$$

wherein $\gamma_{1k}$ and $\gamma_{2k}$ represent the k-th gain code vector 
of the two-dimensional gain code book. 

In order to reduce the calculation amount when 
searching the optimum code vectors of the excitation code 
books, a plurality of proposed excitation code 
vectors (for example, $m_1$ vectors for the first stage and $m_2$ 
vectors for the second stage) can be selected, and then all 
combinations ($m_1 \times m_2$) of the proposed vectors of the first 
and second stages can be searched to select the combination 
minimizing formula (5). 

Also, the gain code book can be searched, according to 
formula (7), for all the combinations of the above-described 
proposed excitation code vectors, or for a predetermined 
number of combinations selected from all the combinations in 
ascending order of the error power, to obtain the 
combination of the gain code vector and the excitation code 
vectors minimizing the error power. In this way, the 
calculation amount is increased but the performance can be 
improved. 
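
The candidate preselection and the search over the m1 x m2 combinations can be sketched as below. This is an illustration under several assumptions: the function names are hypothetical, the code vectors are taken as already convolved with the weighting impulse response, both stages are preselected against the same target, and the two gains are solved jointly by least squares instead of being taken from a gain code book.

```python
import numpy as np


def preselect(target, filtered_cb, m):
    """Return indices of the m vectors with the largest (t.s)^2 / (s.s)."""
    corr = filtered_cb @ target
    energy = np.einsum("ij,ij->i", filtered_cb, filtered_cb)
    score = corr ** 2 / np.maximum(energy, 1e-12)
    return np.argsort(-score)[:m]


def two_stage_search(target, cb1, cb2, m1=2, m2=2):
    """Pick (j, i) minimizing the error of formula (5) over m1 x m2 pairs.

    target: weighted signal minus the adaptive code book contribution;
    cb1, cb2: filtered code vectors, one per row.
    Gains are solved jointly by least squares for each pair (an assumption).
    """
    best = (np.inf, -1, -1, None)
    for j in preselect(target, cb1, m1):
        for i in preselect(target, cb2, m2):
            basis = np.stack([cb1[j], cb2[i]], axis=1)
            gains, *_ = np.linalg.lstsq(basis, target, rcond=None)
            err = float(np.sum((target - basis @ gains) ** 2))
            if err < best[0]:
                best = (err, int(j), int(i), gains)
    return best
```

Only m1 x m2 pairs are evaluated instead of every pair of the two code books, which is where the saving in calculation comes from.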

Next, in the mode classifier part, a cumulative 
pitch prediction distortion is used as the feature amount. 
Firstly, for the proposed pitch periods T selected in every 
subframe by the open loop search in the adaptive code book 
part, pitch prediction error distortions are obtained as 
pitch prediction distortions for every subframe according to 
formula (8) as follows: 

$$D_l = \sum_{n=0}^{N-1} x_{wl}^2(n) - \frac{P_l^2(T)}{Q_l(T)} \tag{8}$$

wherein l represents the subframe number. Then, according to 
formula (9), the cumulative prediction error power of the 
whole frame is obtained, and this value is compared with 
predetermined threshold values to classify the speech 
signals into a plurality of modes. 

$$D = \frac{1}{M} \sum_{l=1}^{M} D_l \tag{9}$$

For example, when the speech signals are classified into 4 
modes, 3 threshold values are determined and the value 
of formula (9) is compared with the 3 threshold 
values to carry out the mode classification. In this case, 
pitch prediction gains can also be used as the pitch 
prediction distortions, in addition to the above. 
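
Formulas (8) and (9) and the threshold comparison can be sketched as follows. The function names are hypothetical, the thresholds are placeholder inputs, and the mapping of a larger distortion to a larger mode index is an assumption of this sketch.

```python
import numpy as np


def pitch_prediction_distortion(cur, past):
    """Formula (8): energy of the subframe minus P^2 / Q for one delay."""
    cur = np.asarray(cur, dtype=float)
    past = np.asarray(past, dtype=float)
    p = float(np.dot(cur, past))
    q = float(np.dot(past, past))
    gain_term = p * p / q if q > 0.0 else 0.0
    return float(np.dot(cur, cur)) - gain_term


def classify_mode(distortions, thresholds):
    """Formula (9) followed by comparison with ascending thresholds.

    Larger averaged distortion gives a larger mode index (an assumption).
    """
    d = float(np.mean(distortions))
    return int(np.searchsorted(thresholds, d, side="right"))
```

With three thresholds the function returns one of four mode indices, 0 to 3.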

In the spectral parameter quantization part, 
spectrum quantization code books are prepared in advance, 
by using training signals, for some of the modes classified 
in the mode classifier part and, when coding, the 
spectrum quantization code books are switched during 
operation by using the mode information. In this manner, the 
memory capacity for storing the code books is increased by 
the number of switched types, but this becomes equivalent to 
providing a larger size of code book as a whole. As a result, 
the performance can be improved without increasing the 
transmission information amount. 

In the excitation quantization part, the training 
signals are classified into the modes in advance and 
different excitation code books and gain code books are 
prepared for every predetermined mode in advance. When 
coding, the excitation code books and the gain code books 
are switched during operation by using the mode information. 
In this way, the memory capacity for storing the code books is 
increased by the number of switched types, but this becomes 
equivalent to providing a larger size of code book as a whole. 
Hence, the performance can be improved without increasing 
the transmission information amount. 

Further, in the excitation quantization part, at 
least one stage of the plurality of stages of the code books 
has a regular pulse construction in which the code vector 
elements are decimated at a predetermined decimation rate 
(for example, decimation rate = 2). When the decimation 
rate = 1, a usual structure is obtained. By such a 
construction, the memory amount required for storing the 
excitation code books can be reduced to 1/decimation rate 
(for example, reduced to 1/2 in the case of a decimation 
rate = 2). Also, the calculation amount required for the 
excitation code book search can be reduced to approximately 
1/decimation rate or less. Further, by decimating the 
elements of the excitation code vectors to make pulses, 
perceptually important pitch pulses can be expressed well, 
particularly in vowel parts of the speech; thus the speech 
quality can be improved. 
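
A regular pulse code vector and the resulting memory saving can be sketched as follows. The function names and the `phase` parameter (the position of the first pulse) are hypothetical and are not taken from the text.

```python
import numpy as np


def expand_regular_pulse(stored, decimation, length, phase=0):
    """Place stored amplitudes every `decimation` samples, zeros elsewhere."""
    vec = np.zeros(length)
    positions = np.arange(phase, length, decimation)
    vec[positions] = stored[:len(positions)]
    return vec


def stored_words(num_vectors, length, decimation):
    """Words needed when only the nonzero pulse amplitudes are stored."""
    pulses = -(-length // decimation)   # ceil(length / decimation)
    return num_vectors * pulses


print(expand_regular_pulse(np.array([1.0, 2.0, 3.0, 4.0]), 2, 8))
```

Because every other element is known to be zero for a decimation rate of 2, only half of the elements need to be stored or multiplied during the search.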

The objects, features and advantages of the 
present invention will become more apparent from the 
consideration of the following detailed description, taken 
in conjunction with the accompanying drawings, in which: 

Figure 1 is a block diagram of a first embodiment 
of a voice coder system according to the present invention; 

Figure 2 is a block diagram of a second embodiment 
of a voice coder system according to the present invention; 

Figure 3 is a block diagram of a third embodiment 
of a voice coder system according to the present invention; 

Figure 4 is a block diagram of a fourth embodiment 
of a voice coder system according to the present invention; 
and 

Figure 5 is a timing chart showing a regular pulse 
used in the fourth embodiment shown in Figure 4. 

Referring now to the drawings, wherein like 
reference characters designate the same or corresponding 
parts throughout the views and thus the repeated description 
thereof can be omitted for brevity, there is shown in Figure 
1 the first embodiment of a voice coder system according to 
the present invention. 

As shown in Figure 1, in the voice coder system, 
speech signals input from an input terminal 100 are divided 
into frames (for example, 40 ms for each frame) in a frame 
divider circuit 110 and are further divided into subframes 
(for example, 8 ms for each subframe), shorter than the 
frames, in a subframe divider circuit 120. 

In a spectral parameter calculator circuit 200, 
the speech signals of at least one subframe are cut out by 
applying a window (for example, 24 ms) longer than the 
subframe, and the spectral parameters are calculated at a 
predetermined order (for example, order P = 10). The 
spectral parameters vary greatly with time in a transient 
interval, particularly between a consonant and a vowel, and 
hence it is desirable to carry out the analysis at short 
intervals. However, such a short-interval analysis increases 
the calculation amount required for the analysis, and thus 
the spectral parameters are calculated for L (> 1) subframes 
(for example, L = 3; the first, third and fifth subframes) 
within the frame. For the subframes that are not analyzed 
(the second and fourth subframes), the respective spectral 
parameters are calculated by the LSP linear interpolation 
described hereinafter, by using the spectral parameters of 
the first and third subframes and of the third and fifth 
subframes. In this case, for the calculation of the spectral 
parameters, a well-known LPC analysis, such as a Burg 
analysis, can be used. In this embodiment, the Burg 
analysis is used. The details of the Burg analysis are 
described, for example, in a book entitled "Signal Analysis 
and System Identification" by Nakamizo, Corona Publishing 
Ltd., pp. 82-87, 1988 (Document 6). 

Further, in the spectral parameter calculator 
circuit 200, the linear prediction coefficients $a_i$ (i = 1 to 10) 
calculated by the Burg method are transformed into line 
spectral pair (LSP) parameters suitable for quantization and 
interpolation. The conversion of the linear prediction 
coefficients to the LSP parameters is executed, for example, by 
using a method disclosed in a paper entitled "Speech 
Information Compression by Linear Spectral Pair (LSP) Speech 
Analysis Synthesizing System" by Sugamura et al., Institute 
of Electronics and Communication Engineers of Japan 
Proceedings, J64-A, pp. 599-606, 1981 (Document 7). That 
is, the linear prediction coefficients obtained by the Burg 
method in the first, third and fifth subframes are 
transformed into the LSP parameters, and the LSP parameters 
of the second and fourth subframes are calculated by linear 
interpolation. The LSP parameters of the second and 
fourth subframes are then restored to linear prediction 
coefficients by an inverse transformation, and the linear 
prediction coefficients $a_{il}$ (i = 1 to 10, l = 1 to 5) of the 
first to fifth subframes are output to a perceptual weighting 
circuit 230. Also, the LSP parameters of the first to fifth 
subframes are fed to a spectral parameter quantization 
circuit 210 having a code book 211. 
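
The interpolation of the LSP parameters for the subframes that are not analyzed can be sketched as follows. The function name is hypothetical, and the equal-weight midpoint is an assumption; the text states only that a linear interpolation between the neighboring analyzed subframes is used.

```python
import numpy as np


def interpolate_lsp(lsp1, lsp3, lsp5):
    """Return LSP vectors for subframes 1..5 given the analyzed 1st, 3rd, 5th.

    The midpoint (equal weights) is an assumption of this sketch.
    """
    lsp1, lsp3, lsp5 = (np.asarray(v, dtype=float) for v in (lsp1, lsp3, lsp5))
    lsp2 = 0.5 * (lsp1 + lsp3)   # between the first and third subframes
    lsp4 = 0.5 * (lsp3 + lsp5)   # between the third and fifth subframes
    return np.stack([lsp1, lsp2, lsp3, lsp4, lsp5])
```

Interpolating in the LSP domain keeps the parameters in ascending order whenever the endpoints are in ascending order, which is one reason LSP parameters are considered suitable for interpolation.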

In the spectral parameter quantization circuit 
210, the LSP parameters of the predetermined subframes are 
effectively quantized. In this embodiment, by using 
vector quantization as the quantizing method, the LSP 
parameters of the fifth subframe are quantized. For the 
method of the vector quantization of the LSP parameters, 
well-known methods can be used. (For example, refer to 
Japanese Patent Application No. Hei 2-297600 (Document 8), 
Japanese Patent Application No. Hei 3-261925 (Document 9), 
and Japanese Patent Application No. Hei 3-155049 (Document 10).) 

Further, in the spectral parameter quantization 
circuit 210, based on the quantized LSP parameters of the 
fifth subframe, the LSP parameters of the first to fourth 
subframes are restored. In this embodiment, by the linear 
interpolation of the quantized LSP parameters of the fifth 
subframe in the present frame and the quantized LSP 
parameters of the fifth subframe in the immediately preceding 
frame, the LSP parameters of the first to fourth subframes 
are restored. That is, after one code vector minimizing the 
error power between the LSP parameters before the 
quantization and the LSP parameters after the quantization 
is selected, the LSP parameters of the first to fourth 
subframes can be restored by the linear interpolation. In 
order to further improve the performance, after a plurality 
of proposed code vectors minimizing the error power are 
selected, a cumulative distortion for each proposed code 
vector is evaluated according to formula (10) shown below, 
and the set of the proposed code vector and the interpolated 
LSP parameters minimizing the cumulative distortion can 
be selected. 

$$D = \sum_{l=1}^{5} \sum_{i=1}^{10} c_i\, b_{il} \left[ \mathrm{lsp}_{il} - \mathrm{lsp}'_{il} \right]^2 \tag{10}$$

wherein $\mathrm{lsp}_{il}$ and $\mathrm{lsp}'_{il}$ represent the LSP parameters of the 
l-th subframe before the quantization and the LSP parameters of 
the l-th subframe restored after the quantization, 
respectively, and $b_{il}$ represents the weighting factors 
obtained by applying formula (11) to the LSP parameters of 
the l-th subframe before the quantization. 

$$b_{il} = \frac{1}{\mathrm{lsp}_{i,l} - \mathrm{lsp}_{i-1,l}} + \frac{1}{\mathrm{lsp}_{i+1,l} - \mathrm{lsp}_{i,l}} \tag{11}$$

Also, $c_i$ represents the weighting factors in the order direction 
of the LSP parameters and, for instance, can be obtained by 
using formula (12) as follows: 

$$c_i = \begin{cases} 1.0 & (i = 1 \text{ to } 8) \\ 0.8 & (i = 9 \text{ to } 10) \end{cases} \tag{12}$$

The LSP parameters of the first to fourth subframes, 
restored as described above, and the quantized LSP parameters 
of the fifth subframe are transformed into linear prediction 
coefficients $a'_{il}$ (i = 1 to 10, l = 1 to 5) for every subframe, 
and the obtained linear prediction coefficients are output to an 
impulse response calculator circuit 310. Also, an index 
representing the code vector of the quantized LSP parameters 
of the fifth subframe is sent to a multiplexer (MUX) 400. 
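
The cumulative distortion of formula (10), with the weights of formulas (11) and (12), can be sketched as follows. The function names are hypothetical. Formula (11) needs neighbors for the first and last LSP parameters, which the text does not define; the values 0 and pi are assumed here for those boundaries.

```python
import numpy as np

C = np.array([1.0] * 8 + [0.8] * 2)   # formula (12)


def lsp_weights(lsp, lower=0.0, upper=np.pi):
    """Formula (11) for one subframe.

    lower/upper stand in for the undefined neighbors of the first and
    last LSP parameters; 0 and pi are assumptions of this sketch.
    """
    padded = np.concatenate(([lower], np.asarray(lsp, dtype=float), [upper]))
    gaps = np.diff(padded)               # lsp_i - lsp_{i-1}, i = 1..P+1
    return 1.0 / gaps[:-1] + 1.0 / gaps[1:]


def cumulative_distortion(lsp_ref, lsp_restored, c=C):
    """Formula (10); both inputs have shape (subframes, order)."""
    lsp_ref = np.asarray(lsp_ref, dtype=float)
    lsp_restored = np.asarray(lsp_restored, dtype=float)
    b = np.stack([lsp_weights(row) for row in lsp_ref])
    return float(np.sum(c * b * (lsp_ref - lsp_restored) ** 2))
```

Closely spaced LSP parameters receive larger weights, so errors near spectral peaks contribute more to the distortion.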

In the above-described operation, in place of the 
linear interpolation, interpolation patterns of the LSP 
parameters corresponding to a predetermined bit number (for 
example, 2 bits) can be prepared, and the LSP parameters of 
the first to fourth subframes are restored for each of these 
patterns to evaluate formula (10). Then, the set of the code 
vector and the interpolation pattern minimizing formula (10) 
can be selected. In this manner, the transmission information 
increases by the bit number of the interpolation patterns. 
However, the temporal change of the LSP parameters within 
the frame can be expressed more precisely. In this case, 
the interpolation patterns can be learned and prepared in 
advance by using LSP parameter data for training, or 
predetermined patterns can be stored. 

In a mode classifier circuit 245, prediction 
error powers of the spectral parameters are used as the 
feature amounts for carrying out the mode classification. The 
linear prediction coefficients for the 5 subframes, calculated 
in the spectral parameter calculator circuit 200, are input and 
transformed into K parameters, and a cumulative prediction 
error power E of the 5 subframes is calculated according to 
formula (13) as follows: 

$$E = \frac{1}{5} \sum_{l=1}^{5} G_l \tag{13}$$

wherein $G_l$ is represented as follows: 

$$G_l = P_l \prod_{i=1}^{10} \left[ 1 - k_{il}^2 \right] \tag{14}$$

In this formula, $P_l$ represents the power of the input signal of 
the l-th subframe. Next, the cumulative prediction error 
power E is compared with predetermined threshold values to 
classify the speech signals into a plurality of 
modes. For example, when classifying into four 
modes, the cumulative prediction error power is compared 
with three threshold values. The mode information 
obtained by the classification is output to an adaptive code 
book circuit 300, and the index (2 bits in the case of four 
modes) representing the mode information is output 
to the multiplexer 400. 
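
Formulas (13) and (14) and the subsequent threshold comparison can be sketched as follows. The function names are hypothetical, the K parameters are taken as given inputs (the conversion from linear prediction coefficients is omitted), and the mapping of a larger E to a larger mode index is an assumption.

```python
import numpy as np


def prediction_error_power(power, k):
    """Formula (14): G_l = P_l * prod(1 - k_il^2) for one subframe."""
    k = np.asarray(k, dtype=float)
    return float(power * np.prod(1.0 - k ** 2))


def classify_frame(powers, k_per_subframe, thresholds):
    """Formula (13) followed by comparison with ascending thresholds.

    Returns (mode, bits needed for the mode index).
    Larger E gives a larger mode index (an assumption of this sketch).
    """
    g = [prediction_error_power(p, k) for p, k in zip(powers, k_per_subframe)]
    e = float(np.mean(g))
    mode = int(np.searchsorted(thresholds, e, side="right"))
    bits = int(np.ceil(np.log2(len(thresholds) + 1)))
    return mode, bits
```

Three thresholds give four modes, and the mode index then fits in 2 bits, matching the example in the text.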

The perceptual weighting circuit 230 inputs the 
linear prediction coefficients $a_{il}$ (i = 1 to 10, l = 1 to 5) of 
every subframe from the spectral parameter calculator circuit 200, 
and executes perceptual weighting of the speech 
signals of the subframes according to formula (1) to output 
perceptual weighting signals. 

A response signal calculator circuit 240 inputs, in 
each subframe, the linear prediction coefficients $a_{il}$ from the 
spectral parameter calculator circuit 200, also inputs, in 
each subframe, the linear prediction coefficients $a'_{il}$, which 
are quantized and restored by the interpolation, from the 
spectral parameter quantization circuit 210, calculates 
response signals $x_z(n)$ for one subframe by using the values 
stored in a filter memory with the input signal set to 
d(n) = 0, and outputs the calculation result to 
a subtracter 250. In this case, the response signals $x_z(n)$ 
are given by formula (15) as follows: 

$$x_z(n) = d(n) - \sum_{i=1}^{10} a_i \eta^i d(n-i) + \sum_{i=1}^{10} a_i \gamma^i y(n-i) + \sum_{i=1}^{10} a'_i x_z(n-i) \tag{15}$$

wherein $\gamma$ represents the same value as that indicated in 
formula (1). 

The subtracter 250 subtracts the response signals 
of one subframe from the perceptual weighting signals 
according to formula (16) to obtain $x_w'(n)$, which is sent to 
the adaptive code book circuit 300. 

$$x_w'(n) = x_w(n) - x_z(n) \tag{16}$$

The impulse response calculator circuit 310 calculates a
predetermined number of points L of the impulse response
$h_w(n)$ of the weighting filter, whose z-transform is
represented by formula (17), and outputs the calculation result
to the adaptive code book circuit 300 and an excitation
quantization circuit 350.

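$$H_w(z) = \frac{1 - \sum_{i=1}^{10} \alpha_i \eta^i z^{-i}}{1 - \sum_{i=1}^{10} \alpha_i \gamma^i z^{-i}} \cdot \frac{1}{1 - \sum_{i=1}^{10} \alpha'_i z^{-i}} \qquad (17)$$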
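
One way to obtain the first L points of such an impulse response
is to feed a unit impulse through the filter sections in
cascade. The sketch below assumes that formula (17) describes a
pole-zero weighting section built from the unquantized factors,
followed by an all-pole section built from the quantized
factors. The names `iir`, `weighted_impulse_response`, `num_w`
and `den_w` are hypothetical, and the numeric values are chosen
only for illustration.

```python
def iir(b, a, x):
    """Run y(n) = sum_k b[k] x(n-k) - sum_{k>=1} a[k] y(n-k); a[0] must be 1."""
    y = []
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y.append(acc)
    return y

def weighted_impulse_response(alpha, alpha_q, num_w, den_w, length):
    """First `length` samples of the cascade impulse response.

    alpha        : unquantized linear prediction factors alpha_1..alpha_p
    alpha_q      : quantized factors alpha'_1..alpha'_p
    num_w, den_w : weighting constants of the numerator and denominator
                   (hypothetical names, assumed reading of formula (17))
    """
    num = [1.0] + [-a * num_w ** (i + 1) for i, a in enumerate(alpha)]
    den = [1.0] + [-a * den_w ** (i + 1) for i, a in enumerate(alpha)]
    syn = [1.0] + [-a for a in alpha_q]
    impulse = [1.0 if n == 0 else 0.0 for n in range(length)]
    weighted = iir(num, den, impulse)   # pole-zero weighting section
    return iir([1.0], syn, weighted)    # all-pole section, quantized factors

# With equal weighting constants the first section cancels out, leaving
# only the first-order all-pole response 0.5 ** n.
h_w = weighted_impulse_response([0.5], [0.5], 0.9, 0.9, 4)
```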

The adaptive code book circuit 300 receives the mode
information from the mode classifier circuit 245, and obtains a
pitch parameter only in predetermined modes. In this case,
there are four modes and, assuming that the threshold values
used in the mode classification increase from mode 0 to mode 3,
mode 0 is considered to correspond to a consonant part and
modes 1 to 3 to a vowel part. Hence, the adaptive code book
circuit 300 seeks the pitch parameters only in mode 1 to
mode 3. First, in an open loop search on the output signals of
the perceptual weighting circuit 230, a plurality of (for
example, M) candidate integer delays that maximize formula (2)
are selected in every subframe. Further, in a short delay
region (for example, delays of 20 to 80), a plurality of
candidate fractional delays near the integer delays are
obtained for each candidate by using the method of the
aforementioned Document 4, and lastly, at least one candidate
fractional delay that maximizes formula (2) is selected in
every subframe. In the following, to simplify the description,
the number of candidates is assumed to be one, and the delay
selected in each subframe is denoted $d_l$ (l = 1 to 5). Next,
in a closed loop search based on the drive excitation signals
v(n) of the past frame, formula (18) is evaluated in every
subframe at several predetermined points $\varepsilon$ near
$d_l$ to obtain the delay that maximizes it, and an index
representing that delay is output to the multiplexer 400. Also,
adaptive code vectors are calculated according to formula (21)
and output to the excitation quantization circuit 350.

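$$D(d_l + \varepsilon) = \frac{P^2(d_l + \varepsilon)}{Q(d_l + \varepsilon)} \qquad (18)$$

where:

$$P(d_l + \varepsilon) = \sum_{n=0}^{N-1} x_w'(n) \bigl[ v\{n - (d_l + \varepsilon)\} * h_w(n) \bigr] \qquad (19)$$

$$Q(d_l + \varepsilon) = \sum_{n=0}^{N-1} \bigl[ v\{n - (d_l + \varepsilon)\} * h_w(n) \bigr]^2 \qquad (20)$$

wherein the sums are taken over the N samples of the subframe,
$h_w(n)$ is the output of the impulse response calculator
circuit 310, and the symbol * denotes convolution.

$$q(n) = \beta \cdot v\{n - (d_l + \varepsilon)\} * h_w(n) \qquad (21)$$

wherein:

$$\beta = \frac{P(d_l + \varepsilon)}{Q(d_l + \varepsilon)} \qquad (22)$$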
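
The closed loop search of formulas (18) to (22) can be
illustrated with the following sketch. It is restricted to
integer delays (the fractional delays of Document 4 are left
out), and the periodic repetition used for delays shorter than
the subframe is a common convention assumed here, not something
stated in the text. All function names and the sample data are
hypothetical.

```python
def convolve_truncated(x, h):
    """Convolve x with h and keep only the first len(x) samples."""
    return [sum(x[k] * h[n - k] for k in range(n + 1) if n - k < len(h))
            for n in range(len(x))]

def delayed_excitation(past, delay, length):
    """v(n - delay) for n = 0..length-1, taken from the past excitation.

    For delay < length the segment is repeated periodically (assumption).
    """
    start = len(past) - delay
    return [past[start + (n % delay)] for n in range(length)]

def closed_loop_search(target, past, h, d_open, offsets=(-1, 0, 1)):
    """Return (best_delay, gain) maximizing P^2/Q around d_open."""
    best = None
    for eps in offsets:
        d = d_open + eps
        if d < 1 or d > len(past):
            continue
        y = convolve_truncated(delayed_excitation(past, d, len(target)), h)
        p = sum(t * s for t, s in zip(target, y))   # as in formula (19)
        q = sum(s * s for s in y)                    # as in formula (20)
        if q == 0.0:
            continue
        score = p * p / q                            # as in formula (18)
        if best is None or score > best[0]:
            best = (score, d, p / q)                 # gain as in formula (22)
    if best is None:
        raise ValueError("no usable delay candidate")
    return best[1], best[2]

# Toy data: the target is 0.5 times the past excitation delayed by 6.
past = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 2.0, 0.0]
target = [0.5, 0.0, 0.0, 0.0]
delay, gain = closed_loop_search(target, past, [1.0], d_open=6)
```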
Further, as described above in the function of the present
invention, in a voiced section (for example, mode 1 to mode 3),
a delay difference between the subframes can be taken and the
difference can be transmitted. In such a construction, for
instance, the fractional delay of the first subframe in the
frame can be transmitted with 8 bits, and the delay difference
from the previous subframe can be transmitted with 3 bits for
each of the second to fifth subframes. Also, in the open loop
delay search, for the second to fifth subframes, only the
neighborhood of the delay of the previous subframe that can be
expressed with 3 bits is searched. The candidate delays are
then not selected independently in every subframe; instead, the
cumulative error power over the 5 subframes is obtained for
each path of candidate delays through the 5 subframes, and the
path that minimizes this cumulative error power is passed to
the closed loop search. In the closed loop search, the
neighborhood of the delay value obtained by the closed loop
search in the previous subframe is searched within a 3-bit
range to obtain the final delay value, and the index
corresponding to the delay value obtained in every subframe is
output to the multiplexer 400.
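
The bit saving of this differential scheme can be checked with
a short sketch. It works on integer delay indexes and assumes
that a 3-bit difference covers the signed range -4 to +3; that
range, the function names and the example values are
assumptions made for illustration only.

```python
FIRST_BITS = 8   # bits for the delay of the first subframe
DIFF_BITS = 3    # bits for each delay difference
SUBFRAMES = 5

# Assumed signed range for a 3-bit difference: -4 .. +3.
DIFF_MIN = -(1 << (DIFF_BITS - 1))
DIFF_MAX = (1 << (DIFF_BITS - 1)) - 1

def encode_delays(delay_indexes):
    """Return (first_index, differences) for one frame of delay indexes."""
    if len(delay_indexes) != SUBFRAMES:
        raise ValueError("expected one delay index per subframe")
    first = delay_indexes[0]
    if not 0 <= first < (1 << FIRST_BITS):
        raise ValueError("first delay index does not fit in 8 bits")
    diffs = [b - a for a, b in zip(delay_indexes, delay_indexes[1:])]
    if any(not DIFF_MIN <= d <= DIFF_MAX for d in diffs):
        raise ValueError("delay difference does not fit in 3 bits")
    return first, diffs

def decode_delays(first, diffs):
    """Rebuild the delay indexes by accumulating the differences."""
    delays = [first]
    for d in diffs:
        delays.append(delays[-1] + d)
    return delays

bits_differential = FIRST_BITS + DIFF_BITS * (SUBFRAMES - 1)   # 8 + 4 * 3
bits_absolute = FIRST_BITS * SUBFRAMES                          # 5 * 8
first, diffs = encode_delays([40, 41, 39, 39, 42])
```

Under these assumptions the delay information takes 20 bits per
frame instead of the 40 bits needed to send every subframe
delay with 8 bits.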

The excitation quantization circuit 350 receives the output
signal of the subtracter 250, the output signal of the adaptive
code book circuit 300 and the output signal of the impulse
response calculator circuit 310, and first carries out a search
of a plurality of stages of vector quantization code books. In
Figure 1, the plurality of vector quantization code books are
shown as excitation code books 351_1 to 351_N. In the following
explanation, to simplify the description, the number of stages
is assumed to be 2. The code vectors of each stage are searched
according to formula (23), which is obtained by modifying
formula (5).
$$D = \sum_{n=0}^{N-1} \bigl[ x_w'(n) - q(n) - \gamma_1 c_{1j}(n) * h_w(n) - \gamma_2 c_{2i}(n) * h_w(n) \bigr]^2 \qquad (23)$$

wherein $x_w'(n)$ is the output signal of the subtracter 250.
Also, in mode 0, since the adaptive code book is not used, a
code vector minimizing formula (24) is searched instead of
formula (23).

$$D = \sum_{n=0}^{N-1} \bigl[ x_w'(n) - \gamma_1 c_{1j}(n) * h_w(n) - \gamma_2 c_{2i}(n) * h_w(n) \bigr]^2 \qquad (24)$$



There are various methods of searching the first and second
stages of code vectors so as to minimize formula (23). In this
case, a plurality of candidates are selected from each of the
first and second stages, and thereafter the combinations of
these candidates are searched to decide the combination that
minimizes the distortion of formula (23). Also, the first and
second stages of the vector quantization code books are
designed in advance by using a large speech database in
consideration of the aforementioned searching method. The
indexes $I_{c1}$ and $I_{c2}$ of the first and second stages of
the code vectors determined as described above are output to
the multiplexer 400.
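
The candidate preselection followed by a joint search can be
sketched as follows. The sketch assumes that the code vectors
have already been convolved with the weighting impulse response
and that any adaptive code book contribution has already been
removed from the target. It also fits the two gains by least
squares for each pair, which is an illustrative simplification
rather than the procedure of the specification; all names and
data are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def preselect(target, codebook, count):
    """Indexes of the `count` vectors with the largest (t.c)^2 / (c.c)."""
    def score(i):
        energy = dot(codebook[i], codebook[i])
        return dot(target, codebook[i]) ** 2 / energy if energy > 0.0 else 0.0
    return sorted(range(len(codebook)), key=score, reverse=True)[:count]

def pair_error(target, c1, c2):
    """Residual energy after a least-squares fit of two gains, or None."""
    a, b, d = dot(c1, c1), dot(c1, c2), dot(c2, c2)
    p1, p2 = dot(target, c1), dot(target, c2)
    det = a * d - b * b
    if abs(det) < 1e-12:          # vectors (nearly) collinear: skip the pair
        return None
    g1 = (p1 * d - p2 * b) / det
    g2 = (p2 * a - p1 * b) / det
    return dot(target, target) - g1 * p1 - g2 * p2

def two_stage_search(target, stage1, stage2, count=2):
    """Return (index1, index2, error) for the best pair among the candidates."""
    candidates1 = preselect(target, stage1, count)
    candidates2 = preselect(target, stage2, count)
    best = None
    for i in candidates1:
        for j in candidates2:
            err = pair_error(target, stage1[i], stage2[j])
            if err is not None and (best is None or err < best[2]):
                best = (i, j, err)
    return best

stage1 = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
stage2 = [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
best = two_stage_search([2.0, 0.0, 0.0, 3.0], stage1, stage2)
```

Only count x count pairs are evaluated jointly, instead of
every pair of the two code books.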

Further, the excitation quantization circuit 350 also executes
a search of a gain code book 355. In mode 1 to mode 3, which
use the adaptive code book, the gain code book 355 is searched
by using the determined indexes of the excitation code books
351_1 to 351_N so as to minimize formula (25).

$$D_k = \sum_{n=0}^{N-1} \bigl[ x_w'(n) - \beta'_k q(n) - \gamma'_{1k} c_{1j}(n) * h_w(n) - \gamma'_{2k} c_{2i}(n) * h_w(n) \bigr]^2 \qquad (25)$$
In this case, the gains of the adaptive code vectors and the
gains of the first and second stages of the excitation code
vectors are quantized by using the gain code book 355. Here,
$(\beta'_k, \gamma'_{1k}, \gamma'_{2k})$ is its k-th code
vector. To minimize formula (25), for instance, the gain code
vector minimizing formula (25) can be found among all the gain
code vectors (k = 0 to $2^B - 1$, where B is the number of bits
of the gain code book). Alternatively, a plurality of candidate
gain code vectors are preliminarily selected, and the gain code
vector minimizing formula (25) can be selected from among them.
After the gain code vector is decided, an index $I_g$
representing the selected gain code vector is output. On the
other hand, in the mode not using the adaptive code book, the
gain code book 355 is searched so as to minimize formula (26)
as follows. In this case, a two-dimensional gain code book is
used.

$$D_k = \sum_{n=0}^{N-1} \bigl[ x_w'(n) - \gamma'_{1k} c_{1j}(n) * h_w(n) - \gamma'_{2k} c_{2i}(n) * h_w(n) \bigr]^2 \qquad (26)$$

A weighting signal calculator circuit 360 receives the
parameters output from the spectral parameter calculator
circuit 200 and the respective indexes, reads out the code
vectors corresponding to the indexes, and first calculates the
drive excitation signals v(n) according to formula (27) as
follows:

$$v(n) = \beta' v(n-d) + \gamma'_1 c_{1j}(n) + \gamma'_2 c_{2i}(n) \qquad (27)$$

However, in the mode not using the adaptive code book,
$\beta' = 0$. Next, by using the parameters output from the
spectral parameter calculator circuit 200 and the parameters
output from the spectral parameter quantization circuit 210,
the weighting signals $s_w(n)$ are calculated for each subframe
according to formula (28) and output to the response signal
calculator circuit 240.

$$s_w(n) = v(n) - \sum_{i=1}^{10} \alpha_i \eta^i v(n-i) + \sum_{i=1}^{10} \alpha_i \gamma^i p(n-i) + \sum_{i=1}^{10} \alpha'_i s_w(n-i) \qquad (28)$$
Figure 2 illustrates the second embodiment of a voice coder
system according to the present invention.

This embodiment concerns a mode classifier circuit 410. In this
embodiment, in place of the adaptive code book circuit 300 of
the first embodiment, there is provided an adaptive code book
circuit 420 including an open loop calculator circuit 421 and a
closed loop calculator circuit 422.

In Figure 2, the open loop calculator circuit 421 calculates at
least one candidate delay in every subframe according to
formulas (2) and (3), and outputs the obtained candidate delay
to the closed loop calculator circuit 422. Further, the open
loop calculator circuit 421 calculates the pitch prediction
error power of formula (29) in every subframe as follows:

$$P_{Gl} = \sum_{n=0}^{N-1} x_{wl}^2(n) - \frac{P_l^2(T)}{Q_l(T)} \qquad (29)$$

The obtained $P_{Gl}$ is output to the mode classifier circuit
410.

The closed loop calculator circuit 422 receives the mode
information from the mode classifier circuit 245, at least one
candidate delay of every subframe from the open loop calculator
circuit 421 and the perceptual weighting signals from the
perceptual weighting circuit 230, and executes the same
operation as the closed loop search part of the adaptive code
book circuit 300 of the first embodiment.

The mode classifier circuit 410 calculates the cumulative
prediction error power $E_G$ as the feature amount according to
formula (30), compares $E_G$ with a plurality of threshold
values to classify the speech signals into the modes, and
outputs the mode information.

$$E_G = \frac{1}{5} \sum_{l=1}^{5} P_{Gl} \qquad (30)$$
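
The quantities of formulas (29) and (30) can be illustrated
with the sketch below. Formulas (2) and (3) are not reproduced
in this part of the text, so the sketch assumes that P(T) is the
cross-correlation between the weighted signal and its copy
delayed by T, and that Q(T) is the energy of the delayed copy.
Function names and sample values are hypothetical.

```python
def pitch_prediction_error(signal, start, length, delay):
    """Energy of one subframe minus P(T)^2 / Q(T) for lag T = delay.

    P(T) and Q(T) follow the assumed definitions given in the lead-in.
    """
    if start - delay < 0:
        raise ValueError("not enough past samples for this delay")
    cur = signal[start:start + length]
    old = signal[start - delay:start - delay + length]
    energy = sum(x * x for x in cur)
    p = sum(x * y for x, y in zip(cur, old))
    q = sum(y * y for y in old)
    return energy - (p * p / q if q > 0.0 else 0.0)

def cumulative_error_power(errors):
    """Average of the per-subframe pitch prediction error powers."""
    return sum(errors) / len(errors)

# A perfectly periodic signal is predicted without error at its period.
periodic = [1.0, 2.0, 3.0, 4.0] * 3
err_periodic = pitch_prediction_error(periodic, start=4, length=4, delay=4)

# Hypothetical per-subframe error powers for one frame of 5 subframes.
e_g = cumulative_error_power([0.0, 2.0, 4.0, 6.0, 8.0])
```

A strongly periodic frame therefore yields a small averaged
error power, while a noise-like frame yields a large one, which
is what makes the value usable for mode classification.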

Figure 3 shows the third embodiment of a voice coder system
according to the present invention.

In this embodiment, as shown in Figure 3, a spectral parameter
quantization circuit 450, including a plurality of quantization
code books 451_0 to 451_{M-1} for the spectral parameter
quantization, receives the mode information from the mode
classifier circuit 445 and switches among the quantization code
books 451_0 to 451_{M-1} according to the predetermined mode.

For the quantization code books 451_0 to 451_{M-1}, a large
amount of spectral parameters for training is classified into
the modes in advance, and a quantization code book is designed
for every predetermined mode. With such a construction, while
the transmission information amount of the indexes of the
quantized spectral parameters and the calculation amount of the
code book search are kept the same as in the first embodiment
shown in Figure 1, the arrangement is nearly equivalent to a
code book several times larger; hence the performance of the
spectral parameter quantization can be greatly improved.

Figure 4 illustrates the fourth embodiment of a voice coder
system according to the present invention.

In this embodiment, as shown in Figure 4, an excitation
quantization circuit 470 includes M (M > 1) sets of N (N > 1)
stages of excitation code books 471_{10} to 471_{1,M-1}, ...,
471_{N0} to 471_{N,M-1} (N x M code books in total) and M sets
of gain code books 481_0 to 481_{M-1}. In the excitation
quantization circuit 470, by using the mode information output
from the mode classifier circuit 245, the N stages of the
excitation code books of a predetermined j-th set among the M
sets and the gain code book of the j-th set are selected in a
predetermined mode to carry out the quantization of the
excitation signals.

When the excitation code books and the gain code books are
designed, a large speech database is classified by mode in
advance and, by using the above-described method, the code
books can be designed for every predetermined mode. By using
these code books, while the transmission information amount of
the indexes of the excitation code books and the gain code
books and the calculation amount of the excitation code book
search are kept the same as in the first embodiment shown in
Figure 1, the arrangement is nearly equivalent to M times the
code book size; hence the performance of the excitation
quantization can be greatly improved.

In the excitation quantization circuit 470 shown in Figure 4,
the N stages of the code books are provided, and at least one
stage of these code books has a regular pulse construction with
a predetermined decimation rate, as shown in Figure 5. Figure 5
shows an example with a decimation rate m = 2. With the regular
pulse construction, no calculation is needed at positions where
the amplitude is zero; thus the calculation amount required for
the code book search can be reduced to approximately 1/m.
Further, there is no need to store the code book entries at
positions where the amplitude is zero; hence the memory amount
needed for storing the code books can also be reduced to
approximately 1/m. The details of the regular pulse
construction are disclosed in a paper entitled "A 6 kbps
Regular Pulse CELP Coder for Mobile Radio Communications" by
M. Delprat et al., edited by Atal, Kluwer Academic Publishers,
pp. 179-188, 1990 (Document 11), and a detailed description is
omitted here for brevity. The code books of the regular pulse
construction are also trained in advance in the same manner as
the above-described method.

Further, the amplitude patterns of different phases can be
expressed by a common pattern when designing the code books; at
coding time, the code books are used by shifting only the phase
in time, so that, in the case of m = 2, the memory amount and
the calculation amount can be further reduced to 1/2. Moreover,
in order to reduce the memory amount, a multi-pulse
construction can be used in addition to the regular pulse
construction.
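
The storage and computation saving of a regular pulse code book
can be seen in a small sketch. Only the nonzero amplitudes are
stored, and one stored pattern serves every phase. The function
names and the sample values are hypothetical and are not taken
from Document 11.

```python
def expand_regular_pulse(amplitudes, m, phase, length):
    """Place stored amplitudes every m-th sample, starting at `phase`."""
    if not 0 <= phase < m:
        raise ValueError("phase must be in 0..m-1")
    vector = [0.0] * length
    for k, amp in enumerate(amplitudes):
        pos = phase + k * m
        if pos < length:
            vector[pos] = amp
    return vector

def sparse_correlation(target, amplitudes, m, phase):
    """Dot product with the expanded vector, visiting nonzero positions only."""
    return sum(amp * target[phase + k * m]
               for k, amp in enumerate(amplitudes)
               if phase + k * m < len(target))

stored = [3.0, 5.0]                 # 2 stored values for a 4-sample vector
phase0 = expand_regular_pulse(stored, m=2, phase=0, length=4)
phase1 = expand_regular_pulse(stored, m=2, phase=1, length=4)
target = [1.0, 2.0, 3.0, 4.0]
corr = sparse_correlation(target, stored, m=2, phase=1)
```

With m = 2 the correlation loop touches half of the samples and
the table holds half of the values, and the same stored pattern
is reused for both phases.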

According to the present invention, various changes and
modifications can be made beyond the above-described
embodiments.

For example, first, as the spectral parameters, other
well-known parameters can be used in addition to the LSP
parameters.

Further, in the spectral parameter calculator circuit 200, when
the spectral parameters are calculated in at least one subframe
within the frame, an RMS change or a power change between the
previous subframe and the present subframe can be measured and,
based on the change, the spectral parameters can be calculated
for a plurality of subframes in which the change is large. In
this manner, the spectral parameters are always analyzed at the
points where the speech changes and hence, even when the number
of subframes to be analyzed is reduced, degradation of the
performance can be prevented.

For the quantization of the spectral parameters, a well-known
method such as vector quantization, scalar quantization, or
vector-scalar quantization can be used.

As to the selection of the interpolation pattern in the
spectral parameter quantization circuit, other well-known
distance measures can be used in addition to formula (10). For
instance, formula (31) can be used as follows:

$$D = \sum_{l=1}^{5} R_l \sum_{i=1}^{10} c_i b_i \left[ \mathrm{lsp}_{il} - \mathrm{lsp}'_{il} \right]^2 \qquad (31)$$

wherein:

$$R_l = \frac{\mathrm{RMS}_l}{\sum_{l=1}^{5} \mathrm{RMS}_l} \qquad (32)$$

In this formula, $\mathrm{RMS}_l$ is the RMS or the power of the
$l$-th subframe.
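
A distance of this kind, in which each subframe contributes in
proportion to its share of the frame RMS, can be sketched as
follows. The per-coefficient weights `coeff_weights` are a
hypothetical stand-in for the weighting terms inside formula
(31), and the LSP values are made up for illustration.

```python
def rms_weights(rms):
    """Normalize per-subframe RMS (or power) values so they sum to 1."""
    total = sum(rms)
    if total <= 0.0:
        raise ValueError("RMS values must have a positive sum")
    return [r / total for r in rms]

def weighted_lsp_distance(lsp, lsp_q, rms, coeff_weights):
    """Sum over subframes l and orders i of R_l * w_i * (lsp - lsp')^2.

    lsp, lsp_q    : lists of per-subframe LSP vectors (original, quantized)
    coeff_weights : per-coefficient weights w_i (hypothetical stand-in)
    """
    total = 0.0
    for r, frame, frame_q in zip(rms_weights(rms), lsp, lsp_q):
        total += r * sum(w * (a - b) ** 2
                         for w, a, b in zip(coeff_weights, frame, frame_q))
    return total

weights = rms_weights([1.0, 1.0, 2.0])
dist = weighted_lsp_distance(
    lsp=[[0.1, 0.2], [0.1, 0.2], [0.3, 0.5]],
    lsp_q=[[0.1, 0.2], [0.1, 0.2], [0.2, 0.5]],
    rms=[1.0, 1.0, 2.0],
    coeff_weights=[1.0, 1.0],
)
```

An error in a loud subframe is thus penalized more heavily than
the same error in a quiet one.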

Further, in the excitation quantization circuit, the gains
$\gamma_1$ and $\gamma_2$ can be made equal in formulas (23) to
(26). In this case, in the mode using the adaptive code book,
the gain code book is two-dimensional; in the mode not using
the adaptive code book, the gain code book is one-dimensional.
Also, the number of stages of the excitation code books, the
number of bits of the excitation code books of each stage, or
the number of bits of the gain code book can be changed for
every mode. For example, mode 0 can have three stages and mode
1 to mode 3 can have two stages.

Moreover, for example, when the construction of 
the excitation code books is of two stages, the second stage 
of the code book is designed corresponding to the first 
stage of the code book, and the code books to be searched in 
the second stage can be switched depending on the code 
vector selected in the first stage. In this case, the 
memory amount is increased but the performance can be 
further improved. 

Also, in the search and the training of the excitation code
books, other well-known distance measures can be used.

Further, concerning the gain code book, a code book whose
overall size is several times larger than that given by the
transmission bit number is trained in advance, and a partial
area of this code book is assigned as the use area for every
predetermined mode. When coding, the use area is switched
depending on the mode.

Furthermore, although a convolution is carried out in the
searches in the adaptive code book circuit and the excitation
quantization circuit, as in formulas (19) to (21) and formulas
(23) to (26), respectively, by using the impulse responses
$h_w(n)$, this can also be performed by a filtering calculation
using the weighting filter whose transfer characteristics are
represented by formula (6). In this way, the calculation amount
is increased but the performance can be further improved.

As described above, according to the present 
invention, the speech is classified into the modes by using 
the feature amount of the speech. The quantization methods 
of the spectral parameters, the operations of the adaptive 
code books and the excitation quantization methods are then 
switched depending on the modes. As a result, high speech 
quality can be obtained at lower bit rates as compared with 
the conventional system. 

While the present invention has been described 
with reference to particular illustrative embodiments, it is 
not to be restricted by those embodiments but only by the 
appended claims. It is to be appreciated that those skilled 
in the art can change or modify the embodiments without 
departing from the scope and spirit of the present 
invention. 
THE EMBODIMENTS OF THE INVENTION IN WHICH AN EXCLUSIVE
PROPERTY OR PRIVILEGE IS CLAIMED ARE DEFINED AS FOLLOWS: 

1. A voice coder system, comprising: 
spectral parameter calculator means for dividing 
input speech signals into frames and further dividing the 
speech signals into a plurality of subframes according to 
predetermined timing, and calculating spectral parameters 
representing spectral features of the speech signals in at 
least one subframe; 

spectral parameter quantization means for 
quantizing the spectral parameters of at least one subframe 
preselected by using a plurality of stages of quantization 
code books to obtain quantized spectral parameters; 

mode classifier means for classifying the speech 
signals in the frame into a plurality of modes by 
calculating predetermined feature amounts of the speech 
signals; 

weighting means for weighting perceptual weights 
to the speech signals depending on the spectral parameters 
obtained in the spectral parameter calculator means to 
obtain weighted signals; 

adaptive code book means for obtaining pitch 
parameters representing pitches of the speech signals 
corresponding to the modes depending on the mode 
classification in the mode classifier means, the spectral 
parameters obtained in the spectral parameter calculator 
means, the quantized spectral parameters obtained in the 
spectral parameter quantization means and the weighted 
signals; and, 

excitation quantization means for searching a 
plurality of stages of excitation code books and a gain code 
book depending on the spectral parameters, the quantized 
spectral parameters, the weighted signals and the pitch 
parameters to obtain quantized excitation signals of the 
speech signals; 
wherein the mode classifier means includes means for
calculating pitch prediction distortions of the subframes
from the weighted signals obtained in the weighting means
and means for executing the mode classification by using a
cumulative value of the pitch prediction distortions
throughout the frame.



2. A voice coder system, comprising: 

spectral parameter calculator means for dividing 
input speech signals into frames and further dividing the 
speech signals into a plurality of subframes according to 
predetermined timing, and calculating spectral parameters
representing spectral features of the speech signals in at 
least one subframe; 

spectral parameter quantization means for 
quantizing the spectral parameters of at least one subframe 
preselected by using a plurality of stages of quantization
code books to obtain quantized spectral parameters; 

mode classifier means for classifying the speech 
signals in the frame into a plurality of modes by 
calculating predetermined feature amounts of the speech 
signals;

weighting means for weighting perceptual weights 
to the speech signals depending on the spectral parameters 
obtained in the spectral parameter calculator means to 
obtain weighted signals; 

adaptive code book means for obtaining pitch
parameters representing pitches of the speech signals
corresponding to the modes depending on the mode
classification in the mode classifier means, the spectral
parameters obtained in the spectral parameter calculator
means, the quantized spectral parameters obtained in the
spectral parameter quantization means and the weighted 
signals; and, 

excitation quantization means for searching a 
plurality of stages of excitation code books and a gain code 
book depending on the spectral parameters, the quantized
spectral parameters, the weighted signals and the pitch 
parameters to obtain quantized excitation signals of the 
speech signals; 

wherein the spectral parameter quantization means includes 
means for switching the quantization code books depending on
the mode classification result in the mode classifier means 
when the spectral parameters are quantized. 

3. A voice coder system, comprising: 

spectral parameter calculator means for dividing 
input speech signals into frames and further dividing the 
speech signals into a plurality of subframes according to 
predetermined timing, and calculating spectral parameters
representing spectral features of the speech signals in at 
least one subframe; 

spectral parameter quantization means for 
quantizing the spectral parameters of at least one subframe 
preselected by using a plurality of stages of quantization
code books to obtain quantized spectral parameters; 

mode classifier means for classifying the speech 
signals in the frame into a plurality of modes by 
calculating predetermined feature amounts of the speech 
signals;

weighting means for weighting perceptual weights 
to the speech signals depending on the spectral parameters 
obtained in the spectral parameter calculator means to 
obtain weighted signals; 

adaptive code book means for obtaining pitch
parameters representing pitches of the speech signals
corresponding to the modes depending on the mode
classification in the mode classifier means, the spectral
parameters obtained in the spectral parameter calculator
means, the quantized spectral parameters obtained in the
spectral parameter quantization means and the weighted
signals; and,

excitation quantization means for searching a
plurality of stages of excitation code books and a gain code 
book depending on the spectral parameters, the quantized 
spectral parameters, the weighted signals and the pitch 
parameters to obtain quantized excitation signals of the 
speech signals; 

wherein the excitation quantization means includes means for 
switching the excitation code books and the gain code book 
depending on the mode classification result in the mode 
classifier means when the excitation signals are quantized. 

4. A voice coder system, comprising:
spectral parameter calculator means for dividing 
input speech signals into frames and further dividing the 
speech signals into a plurality of subframes according to 
predetermined timing, and calculating spectral parameters 
representing spectral features of the speech signals in at 
least one subframe; 

spectral parameter quantization means for 
quantizing the spectral parameters of at least one subframe 
preselected by using a plurality of stages of quantization 
code books to obtain quantized spectral parameters; 

mode classifier means for classifying the speech
signals in the frame into a plurality of modes by
calculating predetermined feature amounts of the speech 
signals; 

weighting means for weighting perceptual weights 
to the speech signals depending on the spectral parameters 
obtained in the spectral parameter calculator means to 
obtain weighted signals; 

adaptive code book means for obtaining pitch 
parameters representing pitches of the speech signals 
corresponding to the modes depending on the mode 
classification in the mode classifier means, the spectral 
parameters obtained in the spectral parameter calculator 
means, the quantized spectral parameters obtained in the 
spectral parameter quantization means and the weighted
signals; and, 

excitation quantization means for searching a 
plurality of stages of excitation code books and a gain code 
book depending on the spectral parameters, the quantized 
spectral parameters, the weighted signals and the pitch 
parameters to obtain quantized excitation signals of the 
speech signals; 

wherein in the excitation quantization means, at least one 
stage of the excitation code books includes at least one 
code book having a predetermined decimation rate. 

5. A voice coder system, comprising: 
a spectral parameter calculator for dividing a 
sequence of input speech signals into a plurality of frames 
and further dividing the speech signals into a plurality of 
subframes according to predetermined timing, and calculating 
spectral parameters representing a predetermined spectral 
characteristic of the speech signals in at least one of the 
subframes; 

a weighting unit for weighting a set of perceptual 
weights to the speech signals depending on the spectral 
parameters calculated by the spectral parameter calculator 
to obtain a set of weighted signals; 

a mode classifier including means for calculating 
a degree of pitch periodicity based on pitch prediction 
distortions calculated from the set of weighted signals and 
for determining one of a plurality of modes for each frame 
by using the degree of pitch periodicity; 

a spectral parameter quantization unit for 
quantizing the spectral parameters, said spectral parameter 
quantization unit including means for switching between a 
plurality of quantization code books, when the spectral 
parameters are quantized, depending on a mode classification 
result in the mode classifier; 
an adaptive code book for obtaining a set of pitch
parameters of the speech signals depending on the mode 
classification result in the mode classifier using the 
spectral parameters, the quantized spectral parameters and 
the set of weighted signals; and, 

an excitation quantization unit for searching a 
plurality of stages of excitation code books and a plurality 
of gain code books using the spectral parameters, the 
quantized spectral parameters and the set of weighted 
signals to obtain a set of quantized excitation signals of 
the speech signals, said excitation quantization unit 
including means for switching between a plurality of 
excitation code books and a plurality of gain code books 
depending on the mode determined by the mode classifier. 